/ 


N95- 19749 




A Parallel Strategy for Implementing Real-Time 
Expert Systems Using CLIPS 


Laszlo A. Ilyes & F. Eugenio Villaseca 
Cleveland State University 
Dept, of Electrical Engineering 


1983 East 24th Street 
Cleveland, Ohio 44115 


John DeLaat 

NASA Lewis Research Center 
21000 Brookpark Road 
Cleveland, Ohio 44135 


ABSTRACT 

As evidenced by current literature, there appears to be a continued interest in the study of real- 
time expert systems. It is generally recognized that speed of execution is only one consideration 
when designing an effective real-time expert system. Some other features one must consider are 
the expert system’s ability to perform temporal reasoning, handle interrupts, prioritize data, 
contend with data uncertainty, and perform context focusing as dictated by the incoming data to 
the expert system. 

This paper presents a strategy for implementing a real time expert system on the iPSC/860 
hypercube parallel computer using CLIPS. The strategy takes into consideration, not only the 
execution time of the software, but also those features which define a true real-time expert 
system. The methodology is then demonstrated using a practical implementation of an expert 
system which performs diagnostics on the Space Shuttle Main Engine (SSME). 

This particular implementation uses an eight node hypercube to process ten sensor measurements 
in order to simultaneously diagnose five different failure modes within the SSME. The main 
program is written in ANSI C and embeds CLIPS to better facilitate and debug the rule based 
expert system. 


INTRODUCTION 

Strictly defined, an expert system is a computer program which imitates the functions of a human 
expert in a particular field [1]. An expert system may be described as a real-time expert system if 
it can respond to user inputs within some reasonable span of time during which input data remains 
valid. A vast body of recently published research clearly indicates an active interest in the area of 
real-time expert systems [2-12]. 

Science and engineering objectives for future NASA missions require an increased level of 
autonomy for both onboard and ground based systems due to the extraordinary quantities of 
information to be processed as well as the long transmission delays inherent to space missions. 
[13]. An expert system for REusable Rocket Engine Diagnostics Systems (REREDS) has been 
investigated by NASA Lewis Research Center [14, 15, 16]. Sequential implementations of the 
expert system have been found to be too slow to analyze data for practical implementation. As 
implemented sequentially, REREDS already exhibits a certain degree of inherent paraUelism. Ten 
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( Value "How are you ?" ) 

) 

(defrule RECEIVE 

) 

(declare (salience -10000 ) ) 

; Send the buffer to our new friend? 

?REC <- ( RECEIVE ) 

( BMSend ?Identity ?BufRef ) 

=> 

; Destroy the buffer 

( BMReceive ) 

( retract ?REC ) 

( BMDestroy ?BufRef ) 

^ (assert( RECEIVE)) 


Figure 5. A CMS Sample 


CONCLUSION 

PVM and CLIPS both provide free source code systems that are well maintained by developers 
and a sizable number of users. Relative few source code changes are necessary to either system 
in order to build a reliable and robust platform that will support distributed computing in a 
heterogeneous environment of CPUs operating under UNIX. The CMS system described in this 
paper provides the CLIPS interface code and some parsing code sufficient to enable efficient use 
of PVM facilities and communication of CLIPS facts and templates among C, C ++ , and CLIPS 
processes within a PVM virtual machine. Even more efficient communication can be obtained 
through enhancements to the PVM source code that can provide more efficient allocation of 
memory and reuse of PVM message buffers in certain applications. 

NOTES 

Information on PVM is best obtained by anonymous ftp from: netlib2.cs.utk.edu 
Shar and tar packages are available from the same source. 

The authors are currently using the CMS system in applications that involve multiple CLIPS 
expert systems in sophisticated interactive user interface settings. It is expected that the basic 
CMS code will become available in the Spring, 1995. Inquiries via e-mail are preferred. 
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sensor measurements are used to diagnose the presence of five different failures which may 
manifest themselves in the working SSME. Each module of code which diagnoses one failure is 
referred to as a failure detector. While some of the sensor measurements are shared between 
failure detectors, the computations within these detectors are completely independent of one 
other. 

One apparent way to partition the problem of detecting failures in the SSME, is to assign each 
failure detector to its own node on the hypercube system. Because the failure detectors may be 
processed simultaneously, a speedup in the execution is expected. But while execution time is a 
critical parameter in any real-time expert system, it is not the only ingredient required in order to 
guarantee its success. A recent report characterized the features required of expert systems to 
operate in real-time. In addition to the requirement of fast execution, the real-time expert system 
should also possess the ability to perform context focusing, interrupt handling, temporal 
reasoning, uncertainty handling, and truth maintenance. Furthermore, the computational time 
required by the system should be predictable and the expert system should potentially be able to 
communicate with other expert systems [17]. These aspects are considered in the design 
presented in this paper. 


METHODS 

The rules for diagnosing failures in the SSME were elicited from NASA engineers and translated 
into an off-line implementation of a REREDS expert system [18]. While some of the failures can 
be diagnosed using only sensor measurements, other failures require both data measurements and 
the results obtained from condition monitors. The condition monitors measure both angular 
velocity and acceleration on various bearings of the High Pressure Oxidizer Turbo-Pump 
(HPOTP) shaft and determine the magnitudes of various torsional modes in the HPOTP shaft 
[19]. Due to the lack of availability of high frequency bearing data and additional hardware 
requirements for implementing real-time condition monitors, this expert system considered only 
those failure detectors which required sensor measurements alone. 

The five failure detectors which rely solely on sensor measurements for diagnosis are listed in 
Table 1 along with a description of the failure, the required sensor measurements, and their 
respective, relative failure states. Notice that failure detectors designated Fll and FI 5 cannot be 
differentiated from one another and are thus combined into one single failure mode. 

For each sensor measurement listed, the expert system knowledge base is programmed with a set 
of nominal values and deviation values (designated in our work by o). One of the roles of the 
expert system is to match incoming sensor measurements with the nominal and deviation values 
which correspond to the specific power level of the SSME at any given time. Any sensor 
measurement may deviate from the nominal value by ± a without being considered high or low 
relative to nominal. Beyond the a deviation, the sensor measurement is rated with a value which is 
linearly dependent upon the amount of deviation. This value is referred to as a vote and is used by 
a failure detector to determine a confidence level that the failure mode is present. This voting 
curve is illustrated in Figure 1. 
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Once a vote has been assigned to every sensor measurement, each failure detector averages the 
votes for all of its corresponding sensor measurements. The final result will be a number between 
-1.00 and +1.00. This result is converted to a percent and is 


Failure 

Detector 

Description 

Measurements 

Failure 

States 

FI 1/15 

Labyrinth/Turbine Seal Leak 

LPFP Discharge Pressure 

Low 



FPOV Valve Position 

High 



HPFTP Turbine Discharge Temp. 

High 

F67 

HPOTP Turbine Interstage & 

HPOTP Discharge Temperature 



Tip Seal Wear 

HPOP Discharge Pressure 




HPOTP Shaft Speed 




MCC Pressure 


F68 

Intermediate Seal Wear 

Secondary Seal Drain Temperature 




HPOTP Inter-Seal Drain Pressure 


F69 

HPOP Primary Seal Wear 

HPOP Primary Seal Drain Pressure 




Secondary Seal Drain Pressure 

Kggjl 



Secondary Seal Drain Temperature 


F70 

Pump Cavitation 

HPOP Discharge Pressure 

Low 



HPOTP Shaft Speed 

Low 



MCC Pressure 

Low 


Table 1. - Failure Detectors Only Requiring Sensor 
Measurements for Failure Diagnosis 



Figure 1. - Voting Curve for Sensor Measurement with “High” Failure State 


referred to as the corresponding confidence level of that failure mode. The underlying motivation 
for this approach is to add inherent uncertainty handling to the expert system. 
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Each individual failure detector was implemented in CLIPS on a personal computer and its 
accuracy was tested and verified using simulated SSME sensor data. Once satisfactory results 
were achieved, an ANSI C program was written for the iPSC/860 hypercube computer which 
would initialize the CLIPS environment on five nodes of a 2 3 hypercube structure. These five 
nodes, referred to as the failure detector nodes, load the constructs for one failure detector each, 
and use CLIPS as an embedded application as described in the CLIPS Advanced Programming 
Guide [20]. In this way, CLIPS will only be used for evaluation of the REREDS rules. All other 
programming requirements, including opening and closing of sensor measurement data files, 
preliminary data analysis, and program flow control are handled in C language. By embedding the 
CLIPS modules within ANSI C code, context focusing and process interruptions can be more 
efficiently realized. 

Coordination of data acquisition and distribution among the failure detector nodes is 
accomplished through a server node which is programmed to furnish sensor measurement data to 
requesting nodes. Since the data for this study originate from the SSME simulator test bed, data 
retrieval is accomplished simply by reading sequential data from prepared data files. The server 
node transfers incoming sensor measurements into an indexed memory array, or blackboard, from 
which data are distributed upon request to the failure detector nodes. When the blackboard is 
updated, all requests for data are ignored until data transfer is completed. This assures that 
reasoning within the expert system is always performed on contemporaneous data. The server 
node does not invoke the CLIPS environment at any time. It is programmed entirely in C 
language code. 

One additional node, referred to as the manager node, is used by the expert system to coordinate 
the timing between the failure detector nodes and the server node. Like the server node, the 
manager node does not invoke the CLIPS environment. Once the manager node has received a 
“ready” message from all failure detector nodes, it orders the server node to refresh the data in the 
blackboard. During this refresh, the failure detector nodes save their results to permanent storage 
on the system. The activities and process flow of all three types of nodes used in this research are 
illustrated in Figure 2. The asterisk denotes the point at which all nodes synchronize. 


CONCLUSION 

Profiling studies were conducted on the parallel implementation of the REREDS expert system. It 
was found that the system could process the sensor measurements and report confidence levels 
for all five failure modes in 1 8 milliseconds. A sequential implementation of the expert system on 
the same hardware was found to require over 50 milliseconds to process and report the same 
information, indicating that the parallel implementation can process data at nearly three times as 
quickly. Considering the fact that seven processors are being used in the parallel implementation, 
these results may seem somewhat disappointing, however, the profiling studies also indicate that 
additional speedup can be realized in future implementations of this expert system if the data 
blackboard is also parallelized. Using only one server node causes some hardware contention. 
Shortly after the nodes synchronize, the failure detectors tend to overwhelm the server with five 
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SERVER 


FAILURE 

DETECTOR 



MANAGER 



(c) 

Figure 2. - Process flow for the a) Server Node, b) Failure Detector Nodes, and c) Manager Node 
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(nearly) simultaneous data requests. By adding a second server node to the system, this 
contention can be greatly reduced. 

Since the data can be processed at a fast, continuous rate, the validity of sensor measurements can 
be assured during processing. Consequently, truth maintenance is realized by suppressing data 
requests to the server node until all sensor measurements have been simultaneously updated. This 
guarantees that all data accessed by the failure detector nodes during any processing cycle is the 
same “age.” 

Due to the nature of the particular expert system selected for this research, the time required by 
the failure detectors to process SSME data remains constant regardless of whether or not a failure 
condition exists. Thus, predictability is always assured for this example. Also, the need for 
temporal reasoning is not explicitly indicated and is therefore not investigated. Since these aspects 
of the design are application specific, they and must be investigated in future work using different 
expert system models. 

As discussed earlier, uncertainty handling is inherent to this expert system. The voting scheme and 
use of confidence levels permits reasoning, even in the presence of noisy, incomplete, or 
inconsistent data. Since the output from the system is a graded value rather than a binary value, 
the output carries with it additional information about the expert system’s confidence that a 
particular failure is occurring. 

One of the most important features of this design is that program flow control and system I/O is 
accomplished in C language code. Using CLIPS as an embedded application within a fast, 
compiled body of C language code allows the expert system to be more easily integrated into a 
practical production system. Complex reasoning can be relegated directly and exclusively to the 
nodes invoking the CLIPS environment, while tasks which are better suited to C language code 
can be performed by the server and manager nodes. Thus, simple decisions can be realized quickly 
in C language rather than relying on the slower CLIPS environment. Based on fast preprocessing 
of the sensor measurements, the C language code can be used to initiate process interrupts during 
emergency conditions and even change the context focusing of the expert system. Those tasks 
which require complex reasoning can be developed and refined separately in CLIPS, taking full 
advantage of the debugging tools available in the CLIPS development environment. 

While the rules for this particular expert system are somewhat simple compared to other 
applications considered in the literature, it is believed that the approach used in this study can be 
extended to other examples. This study demonstrates that parallel processing can not only speed 
up the execution of certain expert systems, but also incorporate other important features essential 
for real-time operation. 
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